Informatics in Medicine Unlocked — Latest Matching Preprints

1

Next-Generation Skin Cancer Detection Using Efficient Fuzzy Fusion of Genomic and Imaging Data

Molla, A. R.; Maity, A.; Saha, S.; Bhattacharya, R.; Chakraborty, A.; Biswas, S.; Nath, S.

2026-06-08 health informatics 10.64898/2026.06.05.26355024 medRxiv

Top 0.1%

5.0%

Skin cancer requires early detection for improved survival rates. Most existing methods rely on deep learning based image classification, which is affected by visual similarity among lesions. Fewer studies use Gene Expression (GE) analysis, which captures molecular characteristics but lacks structural and visual details. To overcome limitations of individual modalities, this paper proposes a multimodal framework integrating dermoscopic images and GE profiles for skin cancer classification. EfficientNet and logistic regression are used for image based analysis and genomic skin lesion profiling, respectively, followed by fuzzy rule based decision systems to reduce uncertainty within individual modalities. Finally, fuzzy fusion combines predictions from both modalities using uncertainty based weighting of classifier outputs. The experimental findings show that both the image based and GE based classification models individually achieved accuracies of nearly 92%. However, the integration of prediction results through the proposed fuzzy fusion strategy further enhanced the classification performance, achieving an overall accuracy of 94.25%. The results obtained outperform contemporary methods, highlighting the effectiveness of combining complementary multimodal information compared with single modality approaches.
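
The abstract does not spell out how the uncertainty-based weighting is computed. The following is a minimal sketch of one plausible reading, assuming each modality's class probabilities are weighted by one minus their normalized Shannon entropy. The function names and the example probabilities are hypothetical, and this is not the authors' fuzzy rule system.

```python
import math

def normalized_entropy(probs):
    """Shannon entropy scaled to [0, 1]; 1 means maximally uncertain."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def fuse(p_image, p_gene):
    """Weight each modality by its confidence (1 - normalized entropy).

    Assumption: this entropy weighting stands in for the paper's
    uncertainty-based weighting, which the abstract does not define.
    """
    w_image = max(0.0, 1.0 - normalized_entropy(p_image))
    w_gene = max(0.0, 1.0 - normalized_entropy(p_gene))
    total = w_image + w_gene
    if total <= 1e-12:  # both modalities maximally uncertain: plain average
        w_image = w_gene = 0.5
        total = 1.0
    return [(w_image * a + w_gene * b) / total for a, b in zip(p_image, p_gene)]

# Hypothetical two-class outputs: the image model is confident, the gene model is not.
fused = fuse([0.9, 0.1], [0.55, 0.45])
print([round(p, 3) for p in fused])
```

Because the weights are non-negative and normalized, the fused vector is a convex combination of the two inputs, so it remains a valid probability distribution and leans toward the more confident modality.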

2

Assessment of the accuracy of lung lesions diagnosis in adolescents with osteosarcoma using artificial intelligence

Uskova, N. G.; Gombolevskiy, V. A.; Chernina, V. Y.; Burenchev, D. V.; Akhaladze, D. G.; Panina, E. V.; Karachunskiy, A. I.; Tereschenko, G. V.; Goncharov, M. Y.; Soboleva, E. A.; Konopleva, E. I.; Bydanov, O. I.; Plekhov, S. Y.; Grachev, N. S.

2026-06-10 radiology and imaging 10.64898/2026.06.08.26354011 medRxiv

Top 0.3%

2.4%

Background. Lung metastases are the main cause of death in osteosarcoma (OS). Accurate diagnosis of nodules on computed tomography (CT) of the lungs is critically important for determining the disseminated stage of the disease and planning surgical treatment. The use of artificial intelligence (AI) in the search for lung nodules increases diagnostic accuracy and reduces the chance of missing metastases. Objective. To evaluate the accuracy of lung nodule diagnosis in adolescents with OS using AI. Methods. A retrospective assessment of CT scans of adolescents with OS was performed. A pathological nodule with an average size of ≥4 mm was considered a target finding. The diagnostic accuracy of an AI algorithm previously trained on an adult dataset was evaluated, and the number of false positives (FP) and false negatives (FN) was determined. Sensitivity, specificity, accuracy, area under the ROC curve (AUC), positive predictive value, negative predictive value, and F1-measure were calculated. Based on the obtained results, the effectiveness of the algorithm was assessed. Results. 248 CT scans of adolescents with OS were evaluated. The AI algorithm produced a FP result in 5 cases (2.02%), a FN result in 34 cases (13.71%), and a correct result (true positive or true negative) in 209 cases (84.27%). The diagnostic accuracy of the algorithm was 0.843 (95% CI 0.794-0.887). Applying the AI algorithm in a radiologist's practice for this specific clinical task would increase sensitivity from 0.805 to 0.891, with an absolute decrease in the number of FN results of 8.59% and a relative decrease of 44%. Conclusion. The obtained results confirm the practical value of the AI algorithm and justify the implementation of AI-assisted systems in the diagnostic protocols for lung metastases in adolescents with OS.
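
The percentages in the Results can be rechecked from the reported counts. The short sketch below only uses numbers stated in the abstract; the variable names are illustrative, and the relative decrease is derived from the rounded sensitivities, so it may differ slightly from the authors' unrounded calculation.

```python
# Counts reported in the abstract.
total_scans = 248
false_positives = 5
false_negatives = 34
correct = 209  # true positives and true negatives combined

# The three outcome groups should account for every scan.
assert false_positives + false_negatives + correct == total_scans

fp_rate = false_positives / total_scans
fn_rate = false_negatives / total_scans
accuracy = correct / total_scans

# Sensitivity without and with AI support, as reported.
sens_before, sens_after = 0.805, 0.891
miss_before = 1 - sens_before  # share of true nodules missed without AI
miss_after = 1 - sens_after    # share of true nodules missed with AI
relative_fn_drop = (miss_before - miss_after) / miss_before

print(f"FP {fp_rate:.2%}, FN {fn_rate:.2%}, accuracy {accuracy:.2%}")
# prints: FP 2.02%, FN 13.71%, accuracy 84.27%
print(f"relative decrease in missed nodules: {relative_fn_drop:.0%}")
# prints: relative decrease in missed nodules: 44%
```

With the rounded sensitivities the absolute drop in the miss rate comes out at about 8.6 percentage points, consistent with the reported 8.59%.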

3

A Consensus-Driven Stacking Ensemble Framework for Interpretable Cardiovascular Risk Prediction and Clinical Deployment

Sozol, S. S.; Dev Nath, B. C.; Fahim, F. M. S.; Suzana, N. N.; Mirza, J. F.; Ahmmed, S.; Zohra, F.-T.; Zafr, A. H. A.; Uddin, M. N.; Mondal, M. R. H.; Hoque, A. S. M. L.

2026-05-26 health informatics 10.64898/2026.05.18.26352989 medRxiv

Top 0.3%

1.9%

Machine learning (ML) is increasingly considered as an aid to diagnosing cardiovascular diseases (CVD). However, inconsistent and limited datasets, limited infrastructure, and global inequalities create a need for a reliable and practicable ML solution. This paper presents an ML-driven framework for predicting CVD risk scores and classifying CVD status. Several data preprocessing techniques, including multiple imputation by chained equations (MICE) and outlier removal, are applied. Hyperparameters are tuned with GridSearchCV, and a consensus-driven five-feature selection method is applied to identify optimal predictors. The dataset used in this study contains healthcare records related to future CVD risk scores, comprising 1,529 patient records with 22 features. The optimized stacked ensemble model achieves a cross-validated coefficient of determination of 98.13% for CVD risk score regression. Comparative evaluation against other ML models confirmed improved accuracy, efficiency, and interpretability. The explainable AI technique SHAP is applied to interpret predictions and highlight key risk factors. In addition, a deployment-ready web platform with multi-role access has been developed to demonstrate clinical applicability. The proposed framework offers a reliable and interpretable tool for early detection of CVD and personalized risk assessment. In the future, this work can be extended to integrate longitudinal data, medical imaging, and deep learning to improve generalizability and strengthen real-world impact.

4

Automated identification of bolus types in modified barium swallow studies using deep learning: a preliminary study

Mao, S.; Sahli, A. J.; Buoy, S. N.; Hutcheson, C.; Gelabert, G. A.; Barbon, C. E. A.; Naser, M. A.; Fuller, C. D.; Brock, K. K.; Hutcheson, K. A.

2026-05-20 radiology and imaging 10.64898/2026.05.16.26353385 medRxiv

Top 0.4%

1.7%

Purpose: Modified Barium Swallow (MBS) studies utilize videofluoroscopy, a dynamic X-ray technique for evaluating swallowing anatomy and physiology. Each MBS exam typically includes multiple bolus trials, often involving different bolus consistencies. Accurate classification of bolus types is essential, as swallowing dynamics, aspiration risks, and residue levels vary with bolus consistency. In this preliminary study, we propose a deep learning-based approach for automated bolus type classification in MBS, aiming to provide a standardized and efficient framework for automated processing of swallowing assessments. Methods: A total of 206 patients (Mean +/- SD age: 60.24 +/- 9.02 years; 89.32% men) underwent MBS examinations, comprising 277 individual MBS studies. The dataset included 2,752 bolus-level video segments, categorized by bolus type as follows: 1,711 liquid (IDDSI 0-3, 62.17%), 521 pudding (IDDSI 4, 18.93%), and 520 solid boluses (IDDSI 7, cookie or cracker, 18.89%). To standardize variable video lengths for the data pipeline, each MBS video was temporally segmented into a fixed-length frame sequence, with shorter videos padded using static frames and longer videos randomly cropped to the target length. We employed an Inflated 3D convolutional neural network to develop the deep learning model. Results: Each video segment contained an average of 273.03 +/- 195.81 frames. On the independent test set, the deep learning model achieved an overall accuracy of 96.13%, and the macro F1-score was 95.05% in classifying food bolus types within MBS videos. Conclusions: The developed AI-based system demonstrated effective automated classification of food bolus types in MBS videos, representing an important step toward fully automated MBS analysis for swallowing efficiency assessment. The AI model reduces the reliance on manual labels, thereby promising to streamline clinical and research workflows.
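
The length standardization described in the Methods (pad short clips, randomly crop long ones) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: `fix_length` is a hypothetical helper, frames are treated as opaque list items, and "static frames" is interpreted here as repeating the last frame.

```python
import random

def fix_length(frames, target_len, rng=random):
    """Return exactly target_len frames from a variable-length clip."""
    if not frames:
        raise ValueError("clip has no frames")
    n = len(frames)
    if n >= target_len:
        # Longer clip: take a random contiguous window of target_len frames.
        start = rng.randint(0, n - target_len)
        return frames[start:start + target_len]
    # Shorter clip: pad by repeating the last frame
    # (assumption: one reading of "padded using static frames").
    return frames + [frames[-1]] * (target_len - n)

# Hypothetical clips, with integers standing in for image frames.
short_clip = [0, 1, 2]
long_clip = list(range(300))
print(fix_length(short_clip, 5))
print(len(fix_length(long_clip, 64, rng=random.Random(0))))
```

Passing a seeded `random.Random` instance keeps the crop reproducible, which is convenient for evaluation, while the module-level default gives a fresh window on each call during training.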

5

The Inflation Reduction Act's Impact Upon Late-Stage R&D

Bowen, H. P.; O'Loughlin, G.; Schleicher, C.; Schulthess, D.

2026-05-28 health economics 10.64898/2026.05.20.26353648 medRxiv

Top 0.5%

1.5%

Background: The impact of the Inflation Reduction Act (IRA) upon late-stage developments has been assumed to be limited. The Congressional Budget Office's IRA analysis excluded post-approval innovation, potentially overlooking substantial economic risks to drug developers and declines in the availability of treatments in areas of high unmet medical need such as oncology. Methods: A total of 1148 secondary trials from 364 FDA-approved medicines, published from 2018 to 2025, were obtained from Biomedtracker and clinicaltrials.gov. Using fractional multinomial logit, we model the share distribution of secondary indication studies across 19 disease groups and assess the change in this distribution post-IRA. We also assessed the number of secondary treatment studies pre- vs. post-IRA using multiple linear regression. Results: After the IRA's introduction, small molecule follow-on studies in oncology exhibited a statistically significant 35% decline (R2 = .48, p < 0.014) and lead indication small molecule oncology approvals exhibited a statistically significant 27% decline (R2 = .70, p < 0.002). We also find a statistically significant 14% decline in the share of orphan oncology studies pre- vs. post-IRA (p<0.001). Research Conclusions: This study's results refute claims that the IRA would have minimal negative effects on patient access or late-stage biopharmaceutical R&D. We hope this study reinvigorates debate about the law's unintended consequences and encourages thoughtful policy solutions, as the IRA manifestly creates disincentives that negatively impact patients seeking needed new medicines, particularly those requiring cures addressing metastatic late-stage cancers.

6

FAMES: Federated additive model using piecewise exponential survival data

Islam, N.; Luo, C.; Tong, J.; Weller, G.; Polleya, D. A.; Kent, A.; Bair, S.

2026-05-19 health informatics 10.64898/2026.05.15.26353335 medRxiv

Top 0.6%

1.2%

Introduction: In analyses of time-to-event data, clinical characteristics can have non-linear impacts on survival outcomes, and understanding this dynamic behavior is crucial for producing real-world evidence (RWE). Nonetheless, estimating these dynamic effects is inherently challenging when utilizing real-world data (RWD), especially since sharing individual-level patient data (IPD) is heavily restricted due to regulatory limitations. Additionally, computational difficulties are exacerbated by the high dimensionality, inter-dependency, rarity, sparsity, and scarcity of features. While data augmentation through collaboration across multiple sites might address these challenges, such collaboration is often infeasible and hindered by regulatory measures that protect patient privacy, thereby preventing the sharing of IPD between sites. Objectives: To address this challenge, we propose a privacy-preserving regularized algorithm that eliminates the necessity of aggregating any protected health information across sites. This algorithm employs a penalized federated additive model utilizing piecewise exponential survival (FAMES) data and estimates non-linear effects of features while accounting for non-varying confounding effects. The model is flexible and can accommodate both multiple and multivariate smooth effects simultaneously. Methods: The proposed model transforms survival data into a piecewise exponential data (PED) structure and casts the semi-parametric optimization problem into a generalized additive modeling framework assuming Poisson distribution. The model uses orthonormal splines to approximate non-linear effects and incorporates L2-norm based penalty terms to control the smoothness and goodness-of-fit of these effects. The algorithm is optimized using site-specific aggregated summary statistics and is solved iteratively through the Newton-Raphson method. Results: The model is employed to assess the smooth effects of clinical features, such as age and numeric laboratory values, on overall survival using RWD from approximately 874 newly diagnosed Acute Myeloid Leukemia (AML) patients treated at seven distinct sites in the United States. The model exhibited non-linear smooth effects for lactate dehydrogenase, platelets, and others, underscoring their strong association with disease prognosis. The model demonstrates a lossless property, providing estimates of smooth and fixed effects that are comparable to those derived from the pooled PED. Additionally, the inference of parameters for testing the nullity of effects remains consistent. This model is communication-efficient, necessitating roughly twelve rounds of communication across sites. Conclusion: We anticipate that this model can facilitate multisite collaboration and enable smaller sites to participate in generating and validating RWE, especially for rare diseases. While the model was applied within the context of AML, it is disease-agnostic and can be implemented in any other clinical context and across various sites globally without losing any generality.
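
The first step in the Methods, turning survival records into a piecewise exponential data (PED) structure, can be illustrated with a minimal sketch. It shows only the standard single-subject transformation, not the federated or penalized parts of FAMES. The function name `to_ped` and the cut points are hypothetical.

```python
def to_ped(time, event, cuts):
    """Split one subject's follow-up into per-interval rows.

    cuts: increasing interval boundaries starting at 0; the last boundary
    must be at least as large as `time`.
    Returns a list of (interval_index, exposure, status) tuples.
    """
    if time > cuts[-1]:
        raise ValueError("follow-up time exceeds the last cut point")
    rows = []
    for j in range(len(cuts) - 1):
        start, end = cuts[j], cuts[j + 1]
        if time <= start:
            break  # subject left the risk set before this interval
        exposure = min(time, end) - start  # time at risk within the interval
        status = 1 if (event == 1 and time <= end) else 0  # event falls here
        rows.append((j, exposure, status))
    return rows

# Hypothetical subject: event observed at t = 3 with boundaries 0, 1, 2, 5.
print(to_ped(time=3, event=1, cuts=[0, 1, 2, 5]))
# prints: [(0, 1, 0), (1, 1, 0), (2, 1, 1)]
```

In the Poisson formulation, each row's status is the response and the logarithm of its exposure enters as an offset, which is what lets a generalized additive model stand in for the survival likelihood.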

7

Development and Prospective Validation of Predictive Model for Early Hemodynamic Deterioration in Critical Care: A Multicenter Study

Nagori, A.; Singh, P.; Firdos, S.; Devadiga, A.; Vats, V.; Gupta, A.; Bandhey, H.; Ailavadi, P.; Awasthi, R.; Narotam, N.; Mishra, A.; Lodha, R.; Sethi, T.

2026-06-10 intensive care and critical care medicine 10.64898/2026.06.05.26353765 medRxiv

Top 0.7%

1.1%

High-frequency physiological monitoring in ICUs can identify impending deterioration hours before clinical recognition yet extracting reliable early-warning signals from noisy vital-sign streams remains challenging. We present SIgnose, an interpretable prediction framework for early detection of abnormal shock index (SI), built from routinely monitored vital signs using physiologic variability and nonlinear time-series features. SIgnose was developed on the eICU Collaborative Research Database and externally validated on the MIMIC-III adult database and a pediatric SafeICU cohort (AIIMS New Delhi), with additional prospective validation in the pediatric ICU. We benchmarked three representation strategies: (i) engineered physiologic variability and nonlinear time-series features, (ii) deep learning, and (iii) Llama-3.1-8B embeddings with low-rank adaptation. Physiologic variability features consistently demonstrated superior cross-cohort generalization. The final model used 3,970 features from five vital signs to predict abnormal SI up to 8 hours ahead, achieving AUROC 0.861 (95% CI 0.859-0.863) and AUPRC 0.927 (95% CI 0.925-0.929) on eICU. External validation yielded AUROC 0.870 (95% CI 0.863-0.876) and AUPRC 0.935 (95% CI 0.930-0.940) on MIMIC-III, and AUROC 0.875 (95% CI 0.863-0.888) and AUPRC 0.915 (95% CI 0.898-0.930) on SafeICU; prospective pediatric validation (n = 88) achieved AUROC 0.885 (95% CI 0.868-0.902) and AUPRC 0.911 (95% CI 0.882-0.936). SHAP interpretability analysis identified heart rate variability, respiratory trend dynamics, and multi-scale blood pressure variability as key early-warning signatures. These findings establish SIgnose as a reproducible, low-compute, early-warning framework and demonstrate that physiologic variability features provide robust, generalizable representations for early deterioration detection across adult and pediatric critical care.

8

Case-level artificial intelligence for multi-photo teledermatology submissions: development and internal validation using patient-submitted dermatology images

Patel, V. P.; Sheth, N.; Patel, A.; Patel, Y.

2026-06-01 dermatology 10.64898/2026.05.21.26353816 medRxiv

Top 0.7%

1.0%

Background: Store-and-forward teledermatology commonly relies on several patient-submitted photographs of the same concern, but most dermatology artificial intelligence models classify single images independently. Objective: To develop and internally validate a case-level diagnostic-support model that aggregates multiple patient-submitted photographs for common dermatologic conditions. Methods: We conducted a retrospective diagnostic-modeling study using the Skin Condition Image Network, a public dataset of deidentified self-taken dermatology images from US adults. We curated 2,336 cases comprising 5,041 images across 10 common inflammatory, allergic, and infectious conditions. Cases were split at the submission level into training, validation, and held-out test sets. Frozen general-purpose and dermatology-specific encoders were compared with image-level classifiers and a gated-attention multiple instance learning model that generated one case-level output from 1-3 images. Results: The strongest image-level baseline, dermatology-specific embeddings with random forest classification, achieved macro/micro ROC-AUCs of 0.797/0.854. Case-level aggregation improved discrimination, with dermatology-specific embeddings plus multiple instance learning achieving mean macro/micro ROC-AUCs of 0.819/0.863 across repeated stratified experiments. The locked final model achieved macro/micro ROC-AUCs of 0.800/0.849 on the held-out test set. Balanced-threshold sensitivity/specificity examples were 0.702/0.688 for eczema and 0.818/0.826 for urticaria. Limitations: Internal validation used a 10-condition subset from a US volunteer dataset; external validation, calibration, subgroup performance analysis, and prospective workflow studies are required. Conclusion: Modeling the teledermatology submission as a multi-image case better reflects asynchronous dermatology workflow than single-image classification. 
The model is preliminary clinician-facing support for structured review and triage, not autonomous diagnosis.

9

Can Artificial Intelligence Match Dermoscopy in Melanoma Detection? Evidence from a Systematic Review and Meta-analysis of Pigmented Skin Lesions

Tang, H.; Zhu, Y.; Diao, M.

2026-05-20 dermatology 10.64898/2026.05.15.26353363 medRxiv

Top 0.8%

0.9%

Accurate risk stratification of pigmented skin lesions is critical for early melanoma detection and for reducing unnecessary excisions. Artificial intelligence (AI) is increasingly applied to dermoscopic image analysis, but its diagnostic performance relative to standard dermoscopy in real-world clinical settings remains uncertain. To address this gap, we conducted a systematic review and meta-analysis of prospective clinical studies directly comparing AI alone, dermoscopy, and AI-assisted clinicians for malignancy risk assessment of pigmented skin lesions. We systematically searched PubMed, Embase, Web of Science, and Cochrane Library from inception to January 2026. Ten studies with 17 diagnostic arms (10 dermoscopy arms, 6 AI-alone arms, and 1 AI-assisted clinician arm) were included. Pooled sensitivity and specificity were 0.773 (95% CI, 0.648-0.863) and 0.793 (95% CI, 0.673-0.877) for dermoscopy, and 0.757 (95% CI, 0.428-0.928) and 0.859 (95% CI, 0.619-0.958) for standalone AI. Summary ROC curves showed overlapping performance, indicating that autonomous AI is broadly comparable to dermoscopy but does not demonstrate a consistent advantage. Heterogeneity in AI performance was driven almost entirely by threshold effects rather than by differences in inherent model capacity. AI-assisted clinicians showed promising results (sensitivity 1.000, specificity 0.837) in a single study, but more evidence is needed. Our findings suggest that, at present, AI should be viewed as a complementary decision-support tool rather than a replacement for dermoscopic evaluation. The study provides valuable evidence for clinicians, guideline developers, and researchers working on AI integration into melanoma diagnostic pathways.

10

E-InfertilityTest: An Explainable AI Framework for Male Infertility Assessment

Das, G.; Ghosh, B.; Ghosh, Z.

2026-05-25 bioinformatics 10.64898/2026.05.21.726746 medRxiv

Top 0.8%

0.9%

Male infertility has emerged as a significant concern in modern society, with genetic defects among its major underlying causes. These defects impair sperm motility and morphology, leading to conditions such as Asthenozoospermia (reduced sperm motility), Teratozoospermia (abnormal sperm morphology), and sometimes Asthenoteratozoospermia (both motility and morphology defects). Assisted reproductive technologies (ART), such as in-vitro fertilization (IVF), offer a potential solution for such cases but have a low success rate. Classical semen analysis provides only a phenotypic snapshot without revealing the fertilizing potential of the sperm. Hence, to screen the functional sperm population and to gain deeper insight into the reasons underlying the aberrant sperm population, it is important to study their genetic profile. In this work, we performed a meta-analysis of transcriptomic data comparing infertile sperm from Asthenozoospermia and Teratozoospermia patients with fertile sperm from normal individuals. We then screened a signature gene set, which was used to develop a prediction model named Explainable Infertility Test (E-InfertilityTest) that classifies fertile versus infertile sperm at a preliminary level. For each prediction, the model also reports the set of genes that play a dominant role in that prediction, thereby providing a patient-specific profile of the dominant gene expression responsible for the aberration. This work warrants future validation experiments to substantiate the model's performance in a clinical setting. Users can access E-InfertilityTest as a standalone tool on GitHub. GitHub link: https://github.com/zglabDIB/einfertility.git

11

Investigation of the continuous spread of SARS-CoV-2 in the post pandemic time - Insights into the reason for the sustained spread despite the establishment of population immunity

Yi, B.

2026-06-08 epidemiology 10.64898/2026.06.05.26355009 medRxiv

Top 0.9%

0.9%

Despite a well-established global immune landscape, SARS-CoV-2 continues to spread and cause infection waves. The reasons for this are poorly understood, and it remains difficult to predict the evolution or spreading trend of SARS-CoV-2. It is therefore necessary to investigate whether the establishment of population immunity has changed the evolution or spreading pattern of the virus. In this investigation, an overall analysis of SARS-CoV-2 spread over the past several years was carried out through a thorough genomic epidemiology study, with Germany chosen as a representative location in view of its systematic genomic surveillance efforts. The growth advantage of a few predominant variants in their early spreading period was evaluated with a logistic regression model. The results reveal that the major circulating SARS-CoV-2 variants since 2023 are mainly derived from the Omicron BA.2 family. Since the middle of 2024, most predominant variants have arisen primarily through recombination, indicating that recombination-derived evolution might be the major driving force for the continuous spread of SARS-CoV-2 despite the existence of population immunity. Furthermore, the lower growth advantage of recently emerged variants might lead to a trend of reduced frequency of infection waves. The findings suggest that although the short-term spreading trend can be affected by specific virus features as well as the local immunity landscape, the long-term spreading trend is mainly determined by the genomic diversity of the viruses and can be predicted through phylogenetic and genomic epidemiology investigation. The results emphasize the importance of maintaining genomic surveillance of SARS-CoV-2, which is essential from both medical and research perspectives.

12

Combining centralized and decentralized approaches to assess and ensure data quality in Eurocrine(R) via Microsoft Power BI and DataquieR

Musholt, T. J.; Clerici, T.; Bergenfelz, A.; Schmidt, C. O.; Struckmann, S.

2026-06-05 health informatics 10.64898/2026.06.04.26354884 medRxiv

Top 0.9%

0.9%

Background: Medical registries have gained importance in the evaluation of healthcare quality outcomes. In the absence of high-quality evidence, such as randomized controlled trials, studies based on registry data are essential for informing clinical guidelines. Methods for assessing data quality are rarely described in detail. To ensure the credibility of registry-based studies, registries must use all available technical and operational means to guarantee high data quality. Method: Eurocrine(R) is a pan-European endocrine surgical database and quality registry initially funded by the EU healthcare programme, which started in 2015 and now includes more than 200,000 interventions as of April 2025. To ensure high data quality, interactive and standardized reports are created via Microsoft Power BI, which are created both centrally and locally. In addition, comprehensive data quality analyses were performed via the R-based package dataquieR. Results: Although a multitude of technical measures (for example, input screen design and real-time plausibility checks during data entry) are in place, they are not sufficient to prevent human errors at data entry. Errors identified in the reports were corrected, and preventive measures were implemented. Overall, the data quality was assessed as very good in terms of completeness, accuracy, and consistency. Conclusion: It is very important to provide registry users with an efficient and smart tool to identify data issues, as they have the clinical information to correct them. Data quality reports generated with dataquieR represent an effective tool for registry administrators. Predesigned Microsoft Power BI reports enable participating Eurocrine(R) clinics to self-audit their data.

13

Acceptability and Perceptions of Artificial Intelligence in Organized Breast Cancer Screening: A Study of French Women

Jean, A.; Merceron, A.; Le Saux, A.; Mercier, E.; Benillouche, P.

2026-06-09 radiology and imaging 10.64898/2026.06.07.26354883 medRxiv

Top 1.0%

0.8%

This study aims to assess women's perceptions of artificial intelligence (AI) used in breast cancer screening in France by examining their knowledge of AI and the barriers to their participation in organized screening. The results of a survey conducted in June 2025 among a national sample of 2000 women (aged 40-75) reveal limited participation and persistent concerns among women. Nevertheless, despite a low awareness of specific AI applications, a large majority of the women surveyed are very favorable to the use of AI in breast cancer diagnosis, even considering it a lever to increase screening participation.

14

Precision Imaging to Evaluate Kaposi Sarcoma (PRIME-KS): protocol for a multicountry novel artificial intelligence-based imaging device

Odeny, T. A.; Adhiambo, H. F.; Mangale, D.; Makanga, P. K.; Odeny, B.; Okuku, F.; Zhou, C.; Geng, E.; Carson, J.; Mudhune, V.; Bukusi, E.; Semeere, A.

2026-06-04 oncology 10.64898/2026.06.03.26354815 medRxiv

Top 1.0%

0.8%

Abstract Background: Kaposi sarcoma (KS) is the most common cancer among men in several Eastern African countries, yet treatment monitoring relies on imprecise, time-consuming ruler-based measurements defined by the AIDS Clinical Trial Group (ACTG). This method suffers from inter-observer variability, fails to capture lesion height or true geometric area, and performs poorly on dark skin. SkinScan3D (SS3D) is a portable, low-cost, AI-enabled 3D imaging device that provides objective measurements of KS skin lesion area, height, volume, and color. The Precision Imaging to Evaluate Kaposi Sarcoma (PRIME-KS) study evaluates whether SS3D provides more reproducible and accurate lesion measurements than the standard method, and validates its integration into routine clinical workflows in Kenya and Uganda. Methods: PRIME-KS is a multicountry prospective mixed-methods study with two clinical objectives. Objective 1 is a cross-sectional diagnostic accuracy study comparing SS3D with ruler-based measurement in 50 adults with KS (150 lesions) across sites in Kenya and Uganda. Two clinicians independently measure three lesions per participant using both methods. The primary outcomes are concordance correlation coefficient (CCC) for inter-rater reproducibility, and co-efficient of determination for accuracy. Objective 2 is a non-randomized before-and-after pilot study in 100 patients at three sites, evaluating device usability, acceptability, appropriateness, and feasibility using validated instruments, along with time-and-motion studies and activity-based micro-costing. Prior to these clinical objectives, a formative study used focus group discussions, discrete choice experiments, and human-centered design workshops to refine the SS3D device and protocols with end-user input. Discussion: PRIME-KS will provide the first rigorous evaluation of a 3D imaging device for monitoring KS treatment response in routine clinical settings. 
If SS3D demonstrates superior reproducibility and clinical utility, it could reduce unnecessary chemotherapy exposure and associated toxicities by enabling earlier, more objective assessment of treatment response. Trial registration: ClinicalTrials.gov NCT06898203, registered 27 March 2025. Pan African Clinical Trials Registry PACTR202603523439856. Keywords: Kaposi sarcoma, SkinScan3D, 3D imaging, treatment monitoring, diagnostic accuracy, implementation science, usability, human-centered design, Kenya, Uganda

15

Early-Horizon Multimodal ICU Mortality Prediction Without Retraining

Bakumenko, A.; Smith, D. H.; Hoelscher, J.

2026-05-21 health informatics 10.64898/2026.05.18.26353392 medRxiv

Top 1%

0.8%

Earlier ICU mortality prediction is more clinically useful because it can identify high-risk patients while treatment decisions can still change. Yet most models are trained on data from a fixed time window, so it is unclear whether a model trained on the first 48 hours of ICU data remains reliable when used earlier in the ICU stay. We evaluated a multimodal ICU mortality model trained once at 48 hours and then applied unchanged at 6, 12, 24, and 48 hours on MIMIC-III. The model combines an LSTM for physiological time-series data, a finetuned ClinicalModernBERT model for clinical notes, and a logistic regression fusion layer. Performance remained strong at earlier time points, suggesting that useful mortality prediction is possible earlier in the ICU stay even without retraining. At 6 hours, the model achieved AUROC 0.777 and remained well-calibrated (ECE 0.038) without any recalibration, and it outperformed both single-modality models at every horizon. The multimodal benefit was most evident at earlier horizons, when physiological data were sparse: agreement between the two specialists dropped by more than half from 48 to 6 hours, while the median contribution from clinical notes increased from 37% to 49%. A Bayesian version of the fusion layer showed that uncertainty decreased for survivors as more data accumulated but remained high for non-survivors; the most uncertain cases were up to 4.9 times more likely to be non-surviving patients. Continuous hourly analyses further showed that clinical notes provide stable context between documentation events. Simply carrying forward the most recent note matched or outperformed note-decay and documentation-gap alternatives. These results suggest that a multimodal ICU mortality model trained on 48 hours of data can provide trustworthy earlier predictions without retraining, while also identifying the cases that remain hardest to interpret.
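
The note carry-forward strategy mentioned in the abstract (use the most recent clinical note at each prediction hour) can be sketched as a sorted lookup. This is a simplified illustration, not the authors' code: `latest_note_at` and the note timestamps are hypothetical, and hours are counted from ICU admission.

```python
from bisect import bisect_right

def latest_note_at(note_hours, hour):
    """Index of the most recent note written at or before `hour`, else None.

    note_hours must be sorted in increasing order.
    """
    i = bisect_right(note_hours, hour)
    return i - 1 if i > 0 else None

# Hypothetical documentation times (hours since ICU admission).
note_hours = [2, 10, 30]
print([latest_note_at(note_hours, h) for h in (1, 6, 12, 24, 48)])
# prints: [None, 0, 1, 1, 2]
```

Between documentation events the lookup keeps returning the same note, which is the sense in which the note provides stable context; a `None` result marks horizons at which no note exists yet.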

16

Positioning Early Phase CNS Trials for Regulatory and Investor Success: Strategic Implications of the Single Phase 3 Approval Paradigm

Schmidt, P.; Preskorn, S.

2026-06-08 neurology 10.64898/2026.06.05.26353604 medRxiv

Top 1%

0.8%

In February 2026, the FDA announced that a single pivotal phase 3 (P3) trial would become the new default standard for drug approval, a regulatory direction that had been legally enabled since the FDA Modernization Act of 1997. This announcement has strategic, scientific, and economic implications for drug developers, contract research organizations (CROs), and biotech investors. We argue that the expansion of this framework, originally reserved for various niche submissions, represents a paradigm change that dramatically increases the value of rigorous early phase (P1 and P2) trial design and requires sponsors to establish both statistical efficacy signals and mechanistic biological understanding before entering phase 3. Using a CNS indication cost model, we show that single P3 approval can reduce total development expenditure from approximately $447 million over 14 years to $297 million over 12 years, a saving of $150 million that also provides two additional years of commercial runway for a modeled CNS drug. Case examples including lecanemab, omaveloxolone, and tofersen illustrate how biomarker-informed early phase strategies can establish the confirmatory evidence necessary for single-trial approval. We provide practical guidance for maximizing the value of P1 and P2 under this evolving framework.

17

ParaDISM: Precise mapping of short reads to genes with highly homologous regions

Tzimotoudis, D.; Farrugia, R.; Zammit, J.; Masini, M. C.; Balestrucci, A.; Carbott, F. B.; Wettinger, S. B.; Alexiou, P.; Ciach, M. A.

2026-05-21 bioinformatics 10.64898/2026.05.19.726275 medRxiv

Top 1%

0.8%

Background: Genes with highly similar genomic copies (paralogs, tandem duplications and pseudogenes) pose a major challenge for Short-Read High Throughput Sequencing (srHTS). High sequence similarity makes it difficult to unambiguously identify the sequences of origin of short reads. This results in misalignment artifacts which can propagate through bioinformatic pipelines and increase error rates in variant calling. Results: We present ParaDISM, a pipeline that refines standard alignments to improve read placement and reduce misalignment-driven false variant calls in highly homologous sequences. ParaDISM assigns a read/read pair to a sequence only when supported by unambiguous sequence-specific evidence by using a multiple sequence alignment of reference sequences to identify disambiguating positions. An optional iterative refinement procedure calls variants from confidently assigned reads, updates the reference sequences, and processes remaining non-assigned reads. We evaluated the performance of ParaDISM both in terms of read alignment and the resulting short variant calls using extensive computational simulation experiments and the Genome in a Bottle HG002 benchmark. We applied ParaDISM to reanalyze two case studies: five public tumour exomes at the GNAQ/GNAQP1 locus, and 18 short-read sequencing datasets of patients diagnosed with Autosomal Dominant Polycystic Kidney Disease (16 exomes and 2 panel sequencing datasets). Compared to the standard aligners (bowtie2, bwa-mem and minimap2), ParaDISM reduced the number of misalignment artifacts and false variant calls, resulting in an increased specificity and precision of the results. Conclusions: ParaDISM improves the precision of read placement and single-nucleotide variant calling in highly homologous reference sequences. By reducing the number of false variant calls caused by misalignment artifacts, ParaDISM provides a stronger level of evidence for the called variants compared to currently available approaches. The pipeline is open source and available under the MIT license at github.com/BioGeMT/ParaDISM.
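
To make the idea of "disambiguating positions" concrete, here is a deliberately simplified sketch. It is not ParaDISM's implementation: the function names, the gap-free read placement, and the toy sequences are all assumptions made for illustration, and it ignores read pairs, base qualities, indels, and the iterative refinement step described in the abstract.

```python
def disambiguating_positions(msa):
    """Return the alignment columns where the reference sequences differ.

    msa : dict mapping a reference name to its aligned sequence
          (all sequences must have the same length).
    """
    seqs = list(msa.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def assign_read(read, start, msa, positions):
    """Assign a read to a reference only when the evidence is unambiguous.

    The read is assumed to sit at alignment column `start` with no gaps.
    Returns the single reference that matches the read at every
    disambiguating column the read covers, or None if the read covers no
    such column or if zero or several references match.
    """
    covered = [i for i in positions if start <= i < start + len(read)]
    if not covered:
        return None  # no sequence-specific evidence: leave unassigned
    matches = [
        name
        for name, seq in msa.items()
        if all(seq[i] == read[i - start] for i in covered)
    ]
    return matches[0] if len(matches) == 1 else None
```

For example, with the toy alignment `{"GENE": "ACGTACGT", "PSEUDO": "ACGAACGT"}` only column 3 differs, so a read placed over columns 4 to 7 stays unassigned, while a read covering column 3 is assigned to whichever copy it matches there.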

18

Determination of the practical utility of ESMO Scale for Clinical Actionability of molecular Targets (ESCAT): mapping OncoKB level 1 alterations using ESCAT

Kordes, M.; Chakravarty, D.; Boberg, E.; Creignou, M.; de Petris, L.; Karlsson, C.; Burstrom, L. L.; Suehnholz, S.; Yachnin, J.; Wiklander, O. P.; Haglund de Flon, F.

2026-05-20 oncology 10.64898/2026.05.16.26353390 medRxiv

Top 1%

0.8%

Background. The European Society for Medical Oncology (ESMO) Scale for Clinical Actionability of molecular Targets (ESCAT) ranks genomic alterations by the evidence supporting the predictive value of the molecular target for response to targeted therapies. No openly available, systematically curated set of standard care biomarkers mapped to the ESCAT framework exists to support clinical decision-making or harmonize biomarker interpretation. Methods. We mapped all OncoKB™ Level 1 biomarkers to ESCAT tiers using evidence cited by OncoKB™, excluding abstract-only data. Eight board-certified oncologists and hematologists independently assigned ESCAT tiers, with discrepancies resolved through structured consensus meetings. Recurring evidence scenarios that did not correspond to any existing ESCAT tier informed a set of a priori defined modifications, which were subsequently applied to biomarkers that could not be classified using native ESCAT criteria. Results. Of 188 OncoKB™ Level 1 biomarkers, 16 were excluded due to abstract-only evidence. Using native ESCAT criteria, 51% of the remaining biomarkers were classified as Tier 1, 3% as Tier 2, 18% as Tier 3, and 6% as Tier X, while 22% could not be assigned to any tier. Applying the modified ESCAT criteria resolved all previously unclassifiable biomarkers and increased Tier 1 assignments to 73%. Inter-rater reliability (Krippendorff's alpha) was moderate (0.586), and 62% of classifications required consensus discussions. Comparison with ESCAT tiers reported in ESMO Clinical Practice Guidelines showed improved concordance when using the modified criteria. Conclusions. The native ESCAT criteria are highly stringent, so many FDA-recognized, clinically validated biomarkers that are currently assigned Level 1 by OncoKB™ do not map to any existing tier. Our predefined modifications improved alignment with OncoKB™ Level 1 designations and with published ESMO Clinical Practice Guidelines. The mapped set of standard care biomarkers is provided on the OncoKB™ website, offering a practical resource that harmonizes ESCAT tiers of evidence with a widely adopted levels-of-evidence schema.

19

Quantifying Cancer Clinical Trial Eligibility Using Artificial Intelligence-Based Matching

Goel, K. P.; Myall, N. J.; Dickerson, J.; Caswell-Jin, J. L.; Johnson, T.; Worth, J. E.; Gensheimer, M. F.

2026-06-05 oncology 10.64898/2026.06.03.26354859 medRxiv

Top 1%

0.8%

PURPOSE: To develop and validate an artificial intelligence-enabled platform that converts unstructured cancer trial eligibility criteria into structured queries and quantifies trial eligibility across advanced/metastatic cancer trials. METHODS: We downloaded actively recruiting US interventional treatment trials for advanced/metastatic breast cancer, colon cancer, and non-small cell lung cancer from ClinicalTrials.gov. Medical oncologists created 24 synthetic patient vignettes. A large language model converted trial eligibility criteria into Structured Query Language (SQL) code and patient information into structured records, enabling automated matching. Cancer details and treatment history were considered, but not laboratory results or comorbidities. Validation included physician editing of generated eligibility code for 30 trials, and blinded physician eligibility assessment for five trials. We then evaluated how age, ECOG performance status, sex, and ZIP code affected the number of eligible trials. RESULTS: Of 833 candidate trials, 746 met inclusion criteria. In physician review of 30 trials, edits to generated SQL did not change any of 720 trial-patient eligibility determinations for 24 synthetic patients. In blinded validation across 120 trial-patient pairs, automated matching achieved 97% accuracy. Across synthetic patients, eligible trials ranged from 31 to 258 when there were no geographic restrictions. Eligibility decreased markedly with worse performance status and with geographic restriction (both p<0.001). Later-phase, randomized, and molecularly selective trials had fewer eligible patients. CONCLUSION: AI-based structuring of trial eligibility criteria can support accurate, scalable measurement of potential cancer trial eligibility. In this demonstration, performance status, geography, and age were major determinants of eligibility across the active metastatic trial landscape.
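
The matching step described above (eligibility criteria expressed as SQL, patients stored as structured records) can be sketched as follows. This is an assumed minimal design, not the authors' platform: the `patient` table, its columns, the trial identifiers, and the WHERE clauses are all hypothetical, and the clauses are treated as trusted input, which would not be safe for arbitrary generated SQL.

```python
import sqlite3


def eligible_trials(patient, trial_queries):
    """Return the trial IDs whose eligibility clause matches one patient.

    patient       : dict with the (hypothetical) keys cancer_type, stage,
                    ecog, age, prior_lines
    trial_queries : dict mapping a trial ID to a SQL WHERE clause written
                    against the `patient` table (trusted input in this sketch)
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE patient ("
            "cancer_type TEXT, stage TEXT, ecog INTEGER, "
            "age INTEGER, prior_lines INTEGER)"
        )
        # Named placeholders are filled from the patient dict.
        conn.execute(
            "INSERT INTO patient VALUES "
            "(:cancer_type, :stage, :ecog, :age, :prior_lines)",
            patient,
        )
        matched = []
        for trial_id, where_clause in trial_queries.items():
            count = conn.execute(
                f"SELECT COUNT(*) FROM patient WHERE {where_clause}"
            ).fetchone()[0]
            if count == 1:
                matched.append(trial_id)
        return matched
    finally:
        conn.close()
```

Running one such query per trial for each of 24 patients yields 24 × 30 = 720 trial-patient determinations for 30 trials and 24 × 5 = 120 for five trials, which is consistent with the counts reported in the abstract.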

20

Comparative Study on Image Quality of Deep Learning and Adaptive Statistical Iterative Reconstruction-V in Thin Layer CT of Liver Lesions

Yang, J.; Li, L.; Cao, J.; Zhang, J.

2026-05-26 radiology and imaging 10.64898/2026.05.23.26353923 medRxiv

Top 1%

0.7%

Objective: This study aims to compare the advantages and disadvantages of deep learning image reconstruction (DLIR) and adaptive statistical iterative reconstruction-V (ASIR-V) in thin-slice (2.5 mm) CT images of hepatic lesions characterized by high and low contrast. Additionally, the study seeks to determine the optimal DLIR strength for the evaluation of liver lesions. Methods: A retrospective analysis was performed on 90 patients who underwent abdominal contrast-enhanced CT scans. Group A comprised 48 patients with low-contrast lesions, while Group B included 42 patients with high-contrast lesions. The acquired images were reconstructed using post-processing DLIR at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) strengths, all with a slice thickness of 2.5 mm (subgroups A1-A3, B1-B3). Furthermore, images were reconstructed with ASIR-V at 50% strength at slice thicknesses of 2.5 mm and 5 mm (subgroups A4/B4 and A5/B5, respectively). CT values and standard deviations (SD) of the liver and lesions were measured, and the corresponding signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) were calculated. The edge rise slope (ERS) was determined using ImageJ software by measuring CT values along a line from the liver parenchyma to the lesion. Objective metrics were compared using one-way ANOVA, with independent samples t-tests applied for inter-group differences. Subjective scoring, which encompassed noise level, diagnostic confidence, and lesion margin delineation, was conducted by two radiologists, with differences analyzed using the Kappa test. Results: Objective evaluation revealed a progressive decrease in lesion SD and a progressive increase in SNR and CNR from subgroups A1/B1 to A3/B3. The SD of subgroup A2 decreased by 57.4% compared to A4, while the SNR and CNR of A2 increased by 19.3% and 24.6%, respectively, compared to A4. Although subgroup B2 had a lower SNR than B5, the difference was not statistically significant. SNR and CNR in B2 increased by 24.1% and 11.9%, respectively, compared to B4. ERS gradually decreased from A1/B1 to A3/B3. ERS values in A2 and B2 increased by 27.0% and 39.4%, respectively, relative to A5 and B5. Although A3 had a lower ERS than A1 and A2, all DLIR subgroups exhibited higher ERS than A5; similar trends were observed in Group B. Subjective evaluation indicated good inter-reader agreement (Kappa > 0.61, p < 0.05). As DLIR strength increased, noise scores rose progressively in both groups. However, noise in A2 and B2 was lower than in A4/A5 and B4/B5. Diagnostic confidence and lesion margin delineation scores were highest in A2 and B2, while all subjective scores were lowest in A5 and B5. Discussion: Most prior studies evaluated the liver or vessels, or confirmed that image quality can be maintained at low doses, whereas few have examined specific individual lesions. This study therefore investigated individual lesions, analyzing lesion detail and detection rate separately to confirm the clinical acceptability of 2.5 mm DLIR images for lesions of different contrast. Conclusion: For both high- and low-contrast hepatic lesions, DLIR provides superior image quality compared to ASIR-V, with the 2.5 mm DLIR-M setting being optimal. DLIR-M reduces image noise, improves spatial resolution, and produces images more suitable for diagnostic purposes.
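
The abstract above reports SNR, CNR, and percentage changes between reconstructions without giving the formulas. The sketch below uses definitions that are common in CT image-quality work (SNR as mean CT value over SD, CNR as the absolute lesion-to-liver difference over a noise SD); these definitions, the choice of noise region, and all function names are assumptions and may differ from what the authors used.

```python
def snr(mean_hu, sd_hu):
    """Signal-to-noise ratio of a region: mean CT value (HU) over its SD."""
    if sd_hu <= 0:
        raise ValueError("SD must be positive")
    return mean_hu / sd_hu


def cnr(lesion_mean_hu, liver_mean_hu, noise_sd_hu):
    """Contrast-to-noise ratio: absolute lesion-to-liver difference over noise SD.

    Which region supplies the noise SD (liver, fat, muscle) is an assumption
    the caller must make; the abstract does not specify it.
    """
    if noise_sd_hu <= 0:
        raise ValueError("noise SD must be positive")
    return abs(lesion_mean_hu - liver_mean_hu) / noise_sd_hu


def percent_change(new_value, reference_value):
    """Relative change of new_value with respect to reference_value, in percent."""
    if reference_value == 0:
        raise ValueError("reference value must be non-zero")
    return (new_value - reference_value) / reference_value * 100.0
```

Under these definitions, a statement such as "CNR increased by 24.6% compared to A4" corresponds to `percent_change(cnr_a2, cnr_a4)` returning about 24.6, where `cnr_a2` and `cnr_a4` are the subgroup values.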